Detecting and identifying anomalies in a signal



1 Introduction 
1.1 Playability

Besides appearance, functionality and price, an important motivation to buy a
certain drive is its ability to play back all sorts of discs without any problems.
In the highly competitive market of optical storage systems, the ability of a
drive to play back more different discs can give a drive producer an edge over its
competitors. In this context the term playability can be defined as follows.

Playability is the ability of a non-ideal optical disc system to play
back a disc without noticeable errors at the user side.

This implies that a data system should deliver error-free information and that
an audio/video system must be allowed to perform error concealment to mask
errors that cannot be prevented or removed.

1.2 Disturbances in optical disc systems

We can identify one particular group of disturbances for an optical disc drive
that has to do with the quality of the optical disc. This quality can severely
deteriorate due to incorrect or incautious handling of the discs by the user, or
it can be poor from the start when the discs are badly produced. One can think
of scratches, dirt spots and fingerprints that arise on the polycarbonate substrate
or the anomalies and impurities that are included in the substrate layer. The
latter are becoming increasingly important again with the fast-growing number of
pirated discs entering the market.

From now on, we will call the disc-related features mentioned above
disc defects. These defects, which are locally present on a disc, will distort the
reflection of the laser beam. Hence they result in abnormal photoelectric signals
that in turn will affect the generation of HF and servo signals and the behavior
of all drive elements relying on these signals. See also Figure 1. Hence we
treat these disc defects as disturbances according to the formal definition given
above. The HF signal is further influenced by the geometry of the impressed
pits and the sequence in which they appear on the disc. Anomalies in this
pit/land structure are of a different origin and hence they will be excluded from
the group of disturbances called disc defects. The following definition of the
term disc defect will be used throughout this report.

Figure 1: Example of a high-frequency signal (upper part) and a radial error
signal (lower part), influenced by a disc defect.



Disc defects are those features locally present on or in an optical
disc that result in different behavior of servo signals than what can
be expected from the geometry of the information track and the
dimensions or shape of the disc.

Note that phenomena such as eccentricity, tilt and skew are excluded by this
definition. This is because these features are directly related to the track geometry
or the disc dimensions and shape. They can also be caused by, for instance, the
clamping of the disc inside the drive.

1.3 Dealing with disc defects 

As already mentioned earlier there are methods to deal with disturbances that 
are influencing a system. An optical disc drive is equipped with several servo 
controllers that must assure the correct positioning of the laser spot on the in- 
formation track. The way in which the servo tries to achieve accurate tracking is 
by constantly adjusting the laser position through an actuator in order to keep 
the positioning error equal to zero. The control algorithm determines how the 
actuator should be driven, based on the momentary error. During the design of 
these controllers the specifications for tracking performance and playability with 
respect to disc defects inevitably lead to a trade-off. Namely, for accurate track- 
ing we want the controller to respond strongly to large position errors, which 
can be achieved by using a high bandwidth controller. Disc defects also result 
in, sometimes large, position errors. Since these errors are unreliable, ideally 
the controller should not respond to them at all, which implies a low bandwidth
controller. Even the use of more sophisticated controllers cannot improve the
playability with respect to disc defects enough without sacrificing tracking
performance. This is simply due to the fact that as soon as a disc defect occurs the
photoelectric signals become severely distorted. These initial distortions can
already influence the system in such a way that it stops functioning properly.
More generally we can state that for certain controlled systems disturbances exist
that endanger the proper functioning of the system due to the severe response
to these disturbances at their onset.

Therefore an accurate and fast detection mechanism is needed that can
initiate proper actions in time and so prevent disturbances from influencing the
system in a way that makes recovery impossible. A detection mechanism that
is currently used monitors the total amount of reflected laser light. Whenever
the reflected light intensity drops rapidly below a certain level, a defect is
detected. In response the position error signals are artificially set to zero
to prevent the controller from responding to false information. This detection,
however, is not fast enough to prevent false position errors from influencing the
controller at the start of the defect. This can already result in severe actuator
drift that makes it hard to restart and continue tracking at the correct position
once the defect has passed. Hence, in order to improve playability with respect
to disc defects, faster detection mechanisms are needed in addition to improved
position controllers in order to cross those disc defects.
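
The currently used detection scheme described above can be illustrated with a minimal sketch. The function name, the threshold value and the sample data are hypothetical and only serve the illustration; the real detector reacts to a rapid drop of the reflected light and runs inside the drive, which is not modelled here.

```python
def gate_error_signal(mirror_level, position_error, threshold):
    """Flag samples whose reflected-light level is below `threshold`
    and force the position error to zero at those samples."""
    if len(mirror_level) != len(position_error):
        raise ValueError("signals must have equal length")
    defect = [level < threshold for level in mirror_level]
    gated = [0.0 if flag else err for flag, err in zip(defect, position_error)]
    return defect, gated


# Hypothetical samples: the light level starts to fall at index 2,
# but only crosses the threshold at index 3.
mirror_level = [1.0, 1.0, 0.9, 0.4, 0.2, 0.3, 0.8, 1.0]
position_error = [0.01, -0.02, 0.10, 0.60, -0.70, 0.50, 0.05, 0.00]
defect, gated = gate_error_signal(mirror_level, position_error, threshold=0.5)
print(defect)
print(gated)
```

In this made-up record the already distorted error at index 2 still reaches the controller, because the light level has not yet crossed the threshold. This is the kind of late reaction that the text identifies as the weakness of the scheme.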

Next to timely information on the occurrence of a disturbance it is also
valuable to know what type of disturbance is entering the system. With this
information available it becomes possible to select those countermeasures that
yield the best results for a particular disturbance. Often, however, no adequate
disturbance modelling techniques are available and the number of possible
disturbances is infinitely large. In those cases the identification of disturbances can
be based on a limited set of disturbance classes with which new disturbances
can be compared. Based on the outcome of this comparison, estimates can be
made of the type of disturbance and its corresponding properties, which can be
used in selecting a proper strategy to counteract its influences. When these
disturbance classes are available they can also be used to design more specific
and hence more accurate detection mechanisms for disturbance classes. This
will further enhance the playability of optical disc drives.



1.4 Objectives 

A precondition for the feasibility of 'design for playability' is the availability
of fundamental knowledge and understanding of the limiting factors, possible
interrelations and ways to deal with them. In this report the focus will be on
several aspects of one of the influencing factors, namely disc defects, as stated
in the following problem definition.

How can the influence of disc defects be characterized and utilized on
the basic engine level in order to achieve playability improvement?

More precisely, the objectives of this research are the following.

• Development of a classification for disc defects based on available signals.

• Initiation of improvements of defect detection.

However, this research will also concentrate on the implementation aspects of
classification and detection methods in order to complete the outlines for
further development. Although the research is conducted from the perspective of
optical disc drives the objectives can be placed in the more general context of
disturbances in data signals and controlled systems. Therefore the results will
be generalized where possible without losing focus on optical disc drives.

Figure 2: Artificial disc defects on standard test discs.

2 Defect measurements 

An essential step in disc defect classification and detection is to understand how
disc defects influence the various available signals in an optical disc drive. We
proceed pragmatically by doing experiments and analyzing the results in order
to gain these insights. Next to increased insight these experiments will also
provide a vast amount of data that can easily be used in the research on disc
defect classification and detection.

In Section 2.1 we deal with the issue of selecting representative disc defects 
and signals suitable for our experiments. In Section 2.2 we briefly address 
various practical issues concerning disc defect measurements. 

2.1 Disc defect signals 

2.1.1 Selection of defects and signals 

In order to limit the number of experiments and the amount of data that will be
generated, we carefully select a representative set of disc defects and signals to
monitor during the experiments. This selection of a limited number of different
disc defects is necessary since it is practically impossible to cover all different
defects possibly present on an optical disc. For example one can easily make
a dozen scratches on a disc and none of them would have exactly the same
shape. As a start we will use the various standardized defects that are used
in optical disc drive development for testing purposes. Despite the different
ways in which the drive generates and processes signals for CD and DVD discs,
we limit ourselves to DVD test discs. Not only are the results of this research
of greater importance for the more sensitive DVD systems, but conclusions can
also relatively easily be extrapolated to the CD case.

The standard disc defects we will investigate are black dots of different sizes,
normal and heavy fingerprints and artificial scratches. The first two artificial
disc defects are produced by printing a single dot or a pattern of small dots on
the substrate surface. For a heavy fingerprint the printed mesh is finer than for a
normal fingerprint. This simulates a greasier, more spread out fingerprint on the
substrate surface. The scratch is made by deliberately damaging a triangular
area of the substrate surface with an abrasive tool. See also Figure 2.

Another standard 'disc defect' is the so-called wedge. This defect is used
to simulate a damaged information layer, for instance caused by a scratch on
the very thin protective layer at the disc's label side. However this defect is
generated by replacing the normal data pattern with a random pit-land pattern.
This electronic representation of a bad information track falls outside the stated
definition of disc defects and hence wedges will not be taken into account.

With the above mentioned set of standardized disc defects already quite a few
different experiments can be conducted by varying the size of the defect and
measuring at different locations inside the defect. For example we can measure
signals when the outer edge of a black dot just falls within the laser beam and
when the dot falls completely inside or even over the laser beam. In addition
we extended the set of disc defects with some realistic ones that are made by
deliberately abusing some new discs. These discs, now containing various radial
and tangential scratches, dirt spots and fingerprints, make the set of defects more
representative and they can help to assess the validity of the artificial defects that
only emulate reality. Another realistic but quite uncommon disc defect is a
so-called white dot. At present no test discs are available for these defects, which show
a higher light reflection when compared to their surroundings. By placing some
dots on a blank DVD+RW disc before writing data to that area and removing
the dots after the writing, these white dots can be simulated easily.

The photodetector of the OPU generates a number of photocurrents,
depending on the exact configuration of the detector. Ideally we would measure these
currents directly since this would rule out the influence of all signal processing
steps and hence give us complete control over the way in which we generate the
signals of interest from these currents. For DVD discs, however, these currents
are not directly available on the test print and since these currents are in the
order of several microamperes in magnitude, measuring them directly on the
engine PCB is not feasible. Signals that are available on the test print are the
various servo signals such as the normalized radial and focus error (REN and
FEN respectively), the normalized mirror (MIRN) signal that is a measure for
the total amount of laser light received by the photodetector, the normalized tilt
signal and of course the HF signals and various derivatives used in the engine
decoder and data path. From preliminary experiments and previous research it
became clear that both the MIRN and HF signal behavior show the most direct
relation with incoming disc defects. We note however that the latter contains
a high-frequency component that carries the digital data. This component can
be regarded as noise when investigating disc defect influences that occur in a
lower frequency range. Next to the MIRN signal we also want to monitor the
behavior of the REN and FEN signals since these signals are directly involved
in the positioning of the laser spot, and reducing the influence of disc defects on
this positioning is precisely what we are aiming at. It is clear that the REN and
FEN signals are not reliable during the occurrence of a disc defect. The fact
that the laser spot position is adjusted by a closed-loop control system increases
this uncertainty. Hence care must be taken when analyzing measurements of
these signals.

2.2 Measuring defect signals

The signal measurements are conducted at a sampling frequency fs of 500 kHz
and for each signal 512 samples are taken. This results in a total measuring
time of 1.024 ms. The sample frequency is fixed for all measurements to make
comparison of measurements possible. Although it is possible to extend the
measuring time by taking in more samples, all experiments are conducted with
the same acquisition length to limit the amount of data and required processing
time. In order to start the measurements, we use the output signal of the
currently implemented defect detector (DBFO) as trigger signal. By including
a certain amount of 'pre-trigger' data in the time series, the slow response of
the detector will not lead to any loss of information. In cases where no accurate
triggering is possible with the DBFO signal, we apply a trigger based on the
MIRN signal.
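
The acquisition settings can be checked with a small sketch. It assumes a record of 512 samples, which is the length implied by the stated 1.024 ms at 500 kHz. The function name, the amount of pre-trigger data (128 samples) and the synthetic sample stream are hypothetical and not taken from the measurements.

```python
FS_HZ = 500_000    # sampling frequency from the text
N_SAMPLES = 512    # assumed record length (1.024 ms at 500 kHz)


def capture_with_pretrigger(stream, trigger_index, n_samples, n_pre):
    """Cut a fixed-length record out of `stream` that starts `n_pre`
    samples before the trigger instant."""
    start = trigger_index - n_pre
    if start < 0 or start + n_samples > len(stream):
        raise ValueError("record does not fit inside the stream")
    return stream[start:start + n_samples]


record_time_ms = N_SAMPLES * 1000 / FS_HZ

# Synthetic stream in which every sample equals its own index.
stream = list(range(2000))
record = capture_with_pretrigger(stream, trigger_index=700,
                                 n_samples=N_SAMPLES, n_pre=128)
print(record_time_ms, record[0], record[128])
```

Because the record starts before the trigger instant, the onset of a defect is retained even when the trigger signal itself responds late.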

All measurement results, of which we will present only a small portion,
are retrieved from the oscilloscope and stored in Matlab data files, since this
program provides extensive capabilities for numerical data processing. By using
the 'structure' data format not only the time series can be stored, but in
combination also various measurement settings and additional comments on the
experiments are saved. This makes it easy to identify particular measurements
when the number of experiments grows. The data for each measurement is
stored in separate files that are named following a strict convention. By using
this convention it is possible to automate the process of data retrieval for a large
number of experiments. The benefits of this approach become even more clear
in the next chapter.

3 Defect classification

Timely knowledge of the type of disc defect that is influencing the optical disc
drive can help to improve servo performance and hence playability. This
information makes it possible to select or adjust control strategies and other
countermeasures to eliminate influences of disc defects on the system. Since
parametric models of signals affected by disc defects are not yet available,
estimation methods such as a Kalman filter cannot be used to identify
disc defects.

Identification of disc defects by comparing new signals with a database of 
known defect signals resolves this problem as long as the database contains 
enough measurements. Given the enormous number of possible disc defects the 
feasibility of this method is limited by the available memory for the database and 
the speed of algorithms to search through the stored data. The size of a database 
with reference signals can be reduced by identifying a limited number of classes 
that each describe a large group of defect signals in the whole data set. These 
defect classes can also be used to design more specific defect detection strategies. 
Such detectors only have to detect particular groups of disc defects instead of 
having one detector looking for the occurrence of all possible defects. The whole 
disc defect classification process is schematically presented in Figure 3. 

3.1 Hierarchical clustering 

An important clustering method is the hierarchy. This structure will form the
basis of the disc defect classification that we treat in more detail in Section 3.2.
In general a hierarchy is a set $S_{\mathcal{H}} = \{S_h : h \in \mathcal{H}\}$ of subsets
$S_h \subseteq \mathcal{X}$, $h \in \mathcal{H}$, called clusters and satisfying the following conditions:

1. $\mathcal{X} \in S_{\mathcal{H}}$;

2. for any $S_1, S_2 \in S_{\mathcal{H}}$, either they are non-overlapping ($S_1 \cap S_2 = \emptyset$) or one
of them includes the other (divisive: $S_1 \subseteq S_2$ or agglomerative: $S_2 \subseteq S_1$),
all of which can be expressed as $S_1 \cap S_2 \in \{\emptyset, S_1, S_2\}$;

3. for each $i \in \mathcal{X}$, the corresponding singleton is a cluster, $\{i\} \in S_{\mathcal{H}}$.
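
The three conditions can be verified mechanically for a small family of sets. The following is only an illustrative sketch; the function name and the example family (six entities grouped into three clusters) are assumptions made for the illustration.

```python
def is_hierarchy(entities, clusters):
    """Check the three hierarchy conditions for a family of clusters."""
    universe = frozenset(entities)
    family = {frozenset(c) for c in clusters}
    if any(not c <= universe for c in family):
        return False                      # clusters must be subsets of the entity set
    if universe not in family:
        return False                      # condition 1: the whole set is a cluster
    if any(frozenset([e]) not in family for e in universe):
        return False                      # condition 3: every singleton is a cluster
    for a in family:
        for b in family:
            if (a & b) not in (frozenset(), a, b):
                return False              # condition 2: disjoint or nested
    return True


entities = [1, 2, 3, 4, 5, 6]
valid = [{e} for e in entities] + [{1, 3, 4}, {2, 5}, set(entities)]
print(is_hierarchy(entities, valid))
print(is_hierarchy(entities, valid + [{4, 5}]))
```

The second family fails because {4, 5} partially overlaps {1, 3, 4}: the two sets are neither disjoint nor nested.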

3.2 Disc defect clustering 

As mentioned we want to classify disc defects and use the results to identify new
disc defects by comparing them with the known reference defects derived from
the classification. This approach relaxes the need for accurate physical models
of disc defects and their influence on measurable signals. In addition it also
drastically reduces the amount of data that is needed for defect identification.
Finally the approach of classification also provides us with the required selection
criteria to identify defects.

We intend to use measurements of servo signals that are affected by various
disc defects for this purpose. Therefore, according to the definitions in the
previous paragraph, we actually need to develop a clustering algorithm. From the
resulting arrangement of disc defect measurements in distinct groups, we can
then derive a formal disc defect classification. Note that these classes themselves
are part of a larger classification that covers all possible disturbances in optical
disc drives.

Figure 3: Disc defect classification flowchart.

Figure 4: Anti-causal, zero-phase filtering process.

3.2.1 Signal processing 

In the subsequent analysis we primarily focus on measurements of the MIRN
signal. The physical understanding of how this signal is influenced by disc
defects helps in the development and the validation of a clustering algorithm. The
unreliability of the REN and FEN signals and the lack of physical understanding
of their behavior during disc defects further justify the focus on the MIRN
signal. In Section 3.3 we briefly come back to this subject.

We expect the MIRN signal to be locally 'constant' when there are no
severe disturbances such as shocks or disc defects present. This assumption can
be confirmed by examining the measurement results presented in Chapter 2.
Those graphs however also show that the measurements are corrupted with 
noise. It is assumed that the observed noise is a combination of measurement 
noise, internally generated system noise and quantization noise caused by the 
analogue-to-digital conversion in the measuring device. In order to remove this 
noise, which can obscure the phenomena we are interested in, we start by filter- 
ing the measurements. 

For the filtering of the time series we use an anti-causal, zero-phase digital
filter implementation. The benefit of this type of filtering is that it makes it
possible to eliminate all phase shifts introduced by the filtering. By further
choosing a gain at low frequencies equal to one, the exact shape of the signal
is preserved. The general processing scheme for this filter implementation is
depicted in Figure 4. Reversing the filtered time series and filtering this
sequence again removes all phase shifts introduced in the first filter pass.
Note that the time reversal of the filtered sequence also introduces the
non-causality mentioned. Hence this type of zero-phase filtering is only possible
with time series that are completely available off-line. Consequently this method to
remove measurement noise cannot be used in an online clustering algorithm.
Figure 5 shows the result of the whole filtering process on a particular defect
measurement.
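
The forward-backward scheme can be sketched as follows. The text does not specify the filter that was actually used, so a first-order low-pass with unity gain at low frequencies is assumed here purely for illustration; the coefficient `alpha` and the test pulse are hypothetical.

```python
def lowpass(x, alpha):
    """Causal first-order low-pass: y[k] = alpha*x[k] + (1 - alpha)*y[k-1].
    The gain at zero frequency equals one."""
    y = []
    state = x[0]  # start in steady state to limit the start-up transient
    for sample in x:
        state = alpha * sample + (1.0 - alpha) * state
        y.append(state)
    return y


def zero_phase_lowpass(x, alpha):
    """Filter forward, reverse, filter again and reverse back."""
    forward = lowpass(x, alpha)
    backward = lowpass(forward[::-1], alpha)
    return backward[::-1]


# Symmetric test pulse centred at sample 50.
x = [0.0] * 101
for k in range(45, 56):
    x[k] = 1.0

causal = lowpass(x, 0.2)
smooth = zero_phase_lowpass(x, 0.2)
peak_causal = max(range(len(x)), key=causal.__getitem__)
peak_zero_phase = max(range(len(x)), key=smooth.__getitem__)
print(peak_causal, peak_zero_phase)
```

A single causal pass delays the peak of the pulse to the end of the pulse, while the forward-backward result keeps it at the centre. The second pass needs the complete record, which is why the scheme only works off-line.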

From the measurement results presented in Chapter 2 we also note that,
for most disc defects, the largest part of the time series is unaffected. Since
we are only interested in the part of the MIRN signal that deviates from its
normal 'constant' level, these constant parts of the measurement are removed.
This is done by selecting the region of interest for each measurement by hand.
Not enough information is available to exactly define the beginning and end of the
affected regions automatically. However, with some physical insight, objective
inspection of the data and common sense the selection can be made relatively
accurately, as shown in Figure 6. This division between the normal and disturbed
MIRN behavior also offers us the possibility to determine the DC offset of the
measured signals and remove it from the time series. This is done by determining
the average value of the normal signal regions and subtracting this offset value
from the complete time series.
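
The offset removal can be sketched as below, assuming the affected region has already been selected by hand and is given as a half-open index range. The function name and the sample values are hypothetical.

```python
def remove_offset_and_crop(signal, begin, end):
    """Estimate the DC offset from the samples outside [begin, end),
    subtract it from the whole series and return the affected part."""
    if not 0 <= begin < end <= len(signal):
        raise ValueError("invalid region of interest")
    normal = signal[:begin] + signal[end:]
    if not normal:
        raise ValueError("no unaffected samples left to estimate the offset")
    offset = sum(normal) / len(normal)
    centered = [value - offset for value in signal]
    return offset, centered[begin:end]


# Made-up MIRN record: normal level 2.0 with a dip at samples 4 to 6.
signal = [2.0, 2.0, 2.0, 2.0, 1.0, 0.5, 1.0, 2.0, 2.0, 2.0]
offset, affected = remove_offset_and_crop(signal, begin=4, end=7)
print(offset, affected)
```

After this step a value around zero corresponds to normal reflection, and negative values to reduced reflection.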

Figure 6: Selecting begin and end of the affected region in a measurement.

3.2.2 Describing defect signals

From literature it appears that the clustering method depends on the kind of
input data that is available. In our case we want to cluster a set of time series.
However we believe that it is not wise to use these time series directly. First of
all the length of the time series depends heavily on the type of disc defect and
the exact location of the laser spot on the disc. Moreover the measurements are
still, despite the filtering, corrupted with noise. Therefore we need to describe
the signal with a set of characteristic parameters that are insensitive to those
variations. Preferably this set is chosen small in order to limit the computational
effort.

A good set of characteristic variables would be the parameters of a model
that describes the disc defect signal. We already pointed out that no suitable
parametric models are available yet. Using a black-box identification method
could resolve this problem. However, to choose the right identification method in
order to get a set of robust parameters, still quite some insight and knowledge of
the system is required. Another possibility is to describe the signals with a set of
statistical quantities. The non-stationary¹ character of the signals under study
complicates this approach. More importantly, the assumption of randomness
interferes with our previous statement that disc defect signals in essence are of
a deterministic nature.

To overcome these problems we approach the problem in a more intuitive
manner. By combining various concepts from the theories discussed above with 
the insights gained on disc defect signals, a suitable set of characteristic signal 
properties is found. Note that from now on we assume that the DC offset of 
the measured MIRN signals is removed as we discussed in Section 3.2.1. This 
implies that a MIRN level around zero corresponds to a normal light reflection 
and higher or lower MIRN values indicate an increased or decreased reflection 
relative to the normal situation. Furthermore we only consider the affected 
region that we previously selected by hand from the whole time series. 

The first characteristic property is the mean value of the disc defect
signal. This value is particularly useful to distinguish between defects that have a
higher and lower reflectivity than the normal disc. For a black dot with a lower
reflectivity than the normal disc, the mean value will be below zero while it will
be positive for a white dot. The second characteristic parameter is the duration
that, more conveniently, can be expressed through the number of measured
samples when the sample time of the measuring device is known. The third
property is the peak value of the disc defect signal. To make a fair comparison
between values for all disc defects, the absolute peak value is taken. Otherwise
the peak value for a white dot would always be higher than it would be for a
black dot. This is undesirable since we are mainly interested in the different
behavior of signals compared to the normal situation. Finally we divide the signal
into a fixed number of amplitude bands and count the number of samples that
fall within each band. See also Figure 7. The resulting values for each
amplitude band complete our set of characteristic parameters. It is not likely that all
these signal properties yield a value in the same order of magnitude. Therefore
we add weighting factors to all parameters in order to obtain a balanced set of
signal properties.

¹ A random process whose statistical properties are invariant in time is called stationary.
Clearly this is not the case for the defect signals under consideration.

Figure 7: Properties used to describe the signals of interest.



Formally we can summarize the results of this section as follows. Let $\mathcal{X}$
denote the set of all possible time series for the MIRN signal. The subset
$\mathcal{X}' \subset \mathcal{X}$ is formed by $L$ time series of length $N$, representing all MIRN signals
that are affected by disc defects and disc defects only. This subset contains all
$N$-dimensional vectors $y_r = (y_{r1}, y_{r2}, \ldots, y_{rN})$, representing the affected MIRN
time series in $\mathcal{X}$. The index $r$ can be seen as a label that uniquely identifies the
corresponding element $i \in \mathcal{X}$. Next $p_1, p_2, \ldots, p_m$ span a space $B$ in $\mathbb{R}^m$, where
$p_j$, $j = 1, 2, \ldots, m$ are the $m$ different descriptive properties for the elements
in $\mathcal{X}$. We can now define the mapping $F : \mathcal{X}' \subset \mathcal{X} \to B$ that maps the time
series of interest (domain) to a set of descriptive signal properties (range). This
mapping can be further specified by defining the functions $f_j : \mathcal{X}' \subset \mathcal{X} \to p_j$,
$j = 1, 2, \ldots, m$. From the selected set of characteristics we discussed earlier,
these functions are defined as follows.

$$ f_1(y_r) = W_1 \frac{1}{N} \sum_{k=1}^{N} y_r(k) \tag{1} $$

$$ f_2(y_r) = W_2 N \tag{2} $$

$$ f_3(y_r) = W_3 \max_k |y_r(k)|, \quad k = 1, 2, \ldots, N \tag{3} $$

$$ f_j(y_r) = W_j \sum_{k=1}^{N} \begin{cases} 1, & (j-4)\Delta y \le |y_r(k)| < (j-3)\Delta y \\ 0, & (j-4)\Delta y > |y_r(k)| \;\lor\; (j-3)\Delta y \le |y_r(k)| \end{cases} \qquad j = 4, \ldots, m-1 \tag{4} $$

$$ f_m(y_r) = W_m \sum_{k=1}^{N} \begin{cases} 1, & (m-4)\Delta y \le |y_r(k)| \\ 0, & (m-4)\Delta y > |y_r(k)| \end{cases} \tag{5} $$

where $k$ is the running variable representing the sample instant, $N$ is the total
number of samples in the affected region of the signal, $\Delta y$ is the width of one
amplitude band and the index $r$ indicates on which element of $\mathcal{X}' \subset \mathcal{X}$ the
mapping is performed. The number of descriptive properties $m$ is defined by the
number of amplitude bands that is used in the discrete amplitude distribution.
See (4) and (5). $W_1, W_2, \ldots, W_m$ are the weighting factors that can be used to
balance the signal mapping.

Figure 8: Geometric interpretation of signal mapping and clustering for n = 2.
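
The mapping from an affected time series to its property vector (mean, duration, absolute peak and amplitude-band counts) can be sketched as follows. The function name, the band width, the number of bands and the unit weights are assumptions for the illustration, and treating the last band as open-ended is an interpretation of the definitions above rather than a confirmed detail.

```python
def describe(y, n_bands, band_width, weights=None):
    """Map an offset-free defect signal y onto m = 3 + n_bands properties:
    mean, number of samples, absolute peak and one count per amplitude band."""
    n = len(y)
    m = 3 + n_bands
    if weights is None:
        weights = [1.0] * m               # assumed: no balancing applied
    if len(weights) != m:
        raise ValueError("one weight per property is required")
    mean = sum(y) / n
    peak = max(abs(value) for value in y)
    counts = [0] * n_bands
    for value in y:
        # Samples beyond the highest band edge fall in the last band (assumption).
        band = min(int(abs(value) // band_width), n_bands - 1)
        counts[band] += 1
    raw = [mean, float(n), peak] + [float(c) for c in counts]
    return [w * v for w, v in zip(weights, raw)]


# Made-up affected region of a 'black dot'-like dip.
properties = describe([-1.0, -1.5, -1.0], n_bands=4, band_width=0.5)
print(properties)
```

With real data the weights would be chosen so that no single property dominates the distances used in the clustering step.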

3.2.3 Clustering algorithm 

With the mappings presented in (1) to (5), $L$ different $m$-dimensional property
vectors $p_r = (f_1(y_r), f_2(y_r), \ldots, f_m(y_r))$, $r = 1, 2, \ldots, L$ can be constructed.
These row vectors can also be interpreted as unique points in $B$, representing
the corresponding MIRN time series in $\mathbb{R}^m$. In general, clustering is the process
of distinguishing separate groups of similar data. In analogy to the geometric
interpretation of the MIRN signal mapping, clustering can be seen as the
identification of different groups of closely spaced data points. This is depicted
graphically in Figure 8 for the two-dimensional case. For the various clusters
($S_1$, $S_2$ and $S_3$ in Figure 8) it holds that the geometrical distances between
points within one cluster are much smaller than the distances between points
belonging to separate clusters.

A clustering method that directly uses this geometric interpretation of
similarity is agglomerative hierarchical clustering. The input for this clustering
method is a so-called dissimilarity entity-to-entity matrix, where each entity is
considered as a single cluster or singleton, denoted by $S_h$, $h \in \mathcal{H}$. Note that $\mathcal{H}$
is the set of all cluster labels and that each $h$ is uniquely related to one
cluster. For an agglomerative hierarchical clustering of $L$ objects, the set $\mathcal{H}$ holds
$2L - 1$ labels, where the first $L$ elements correspond to the original entities or
singletons. The dissimilarity matrix can easily be derived from the mapped data
points by calculating the distance between every pair of objects in the data set.
From literature various definitions for vector distances are available.

Euclidean distance

$$ d_{rs} = \sqrt{(p_s - p_r)(p_s - p_r)^T} \tag{6} $$

City Block distance

$$ d_{rs} = \sum_{j=1}^{m} |p_{sj} - p_{rj}| \tag{7} $$

Minkowski metric

$$ d_{rs} = \left( \sum_{j=1}^{m} |p_{sj} - p_{rj}|^{p} \right)^{1/p} \tag{8} $$

A more general notation. When p = 1 it represents the City Block
distance, and when p = 2, this metric is equal to the Euclidean distance.
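
The relation between the three distance definitions can be checked numerically with a short sketch. The function name and the two example points are hypothetical.

```python
def minkowski(a, b, p=2.0):
    """Minkowski distance between two property vectors of equal length.
    p = 1 gives the City Block distance, p = 2 the Euclidean distance."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


p_r = (0.0, 0.0)
p_s = (3.0, 4.0)
city_block = minkowski(p_r, p_s, p=1.0)
euclidean = minkowski(p_r, p_s, p=2.0)
print(city_block, euclidean)
```

For these two points the City Block distance is the sum of the coordinate differences (3 + 4), and the Euclidean distance is the length of the straight line between them.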

The indices r and s denote the labels for the corresponding clusters. The most
widely used distance measure is the Euclidean distance, usually denoted as
$\|p_s - p_r\|$. Also note the similarity of the above distance measures with the
concept of vector norms. With the above distance measures a dissimilarity
matrix $D = [d_{rs}]$ with $r, s = 1, 2, \ldots, L$ can be constructed. Note that $D$ is
symmetric and the elements of its main diagonal are zero. With the dissimilarity
matrix available the main steps of the algorithm are as follows.

Step 1 Find the minimal value $d(r^*, s^*)$, $r^* \neq s^*$, in the dissimilarity matrix,
and form the merged cluster $S_h = S_{r^*} \cup S_{s^*}$, $h \in \mathcal{H}$.

Step 2 Transform the dissimilarity matrix by substituting one new row (and
column) $h$ for the rows and columns $r^*$, $s^*$, with its dissimilarities defined
as

$$ d(r, s) = F(\{S_r\}, \{S_s\}, l_r, l_s) \tag{9} $$

with $r, s \in \{1, 2, \ldots, h\} \setminus \{r^*, s^*\}$. $F$ is a fixed dissimilarity function and
$l_r$, $l_s$ define the number of objects in cluster $S_r$ and $S_s$ respectively. If the
number of clusters obtained is larger than 2, go to Step 1, else End.

The function F defines the dissimilarity between the merged clusters. Since
these clusters can contain more than one object, the distance measures, as
defined in (6) to (8), cannot be used here. Several popular methods to define
the inter-cluster distance or dissimilarity are presented below.

Nearest neighbor (Single linkage) uses the smallest distance between
objects in the two clusters $S_r$ and $S_s$.

$$ d(r, s) = \min \|p_{sj} - p_{ri}\|, \quad i \in (1, \ldots, l_r),\ j \in (1, \ldots, l_s) \tag{10} $$

Farthest neighbor (Complete linkage) uses the largest distance between
objects in the two clusters.

$$ d(r, s) = \max \|p_{sj} - p_{ri}\|, \quad i \in (1, \ldots, l_r),\ j \in (1, \ldots, l_s) \tag{11} $$

Average linkage uses the average distance between all pairs of objects in the
two clusters $S_r$ and $S_s$.

$$ d(r, s) = \frac{1}{l_r l_s} \sum_{i=1}^{l_r} \sum_{j=1}^{l_s} \|p_{sj} - p_{ri}\| \tag{12} $$

Figure 9: Graphical representation of a hierarchical cluster tree.


Centroid linkage uses the distance between the centroids of the two groups
$S_r$ and $S_s$.

$$d(r, s) = \|\bar{p}_r - \bar{p}_s\| \tag{13}$$

where:

$$\bar{p}_r = \frac{1}{l_r} \sum_{i=1}^{l_r} p_{ri} \tag{14}$$

and $\bar{p}_s$ is defined similarly.



Ward linkage uses the incremental sum of squares; that is, the increase in the
total within-group sum of squares as a result of merging clusters $S_r$ and $S_s$.

$$d(r, s) = \frac{l_r \, l_s \, d_c^2(r, s)}{l_r + l_s} \tag{15}$$

where $d_c^2(r, s)$ is the squared distance between clusters $S_r$ and $S_s$ defined
in the Centroid linkage by (13).
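The two algorithm steps and the five linkage definitions (10) to (15) can be sketched together as follows. This is a minimal illustration rather than an efficient implementation: it recomputes all inter-cluster dissimilarities directly from the objects in every pass instead of updating the dissimilarity matrix, and the function names and the `stop_at` parameter are assumptions made for the example. It requires Python 3.8 or later for `math.dist`.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors, as in (14)."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def linkage_distance(a, b, method):
    """Inter-cluster dissimilarity between the point lists a and b."""
    pairwise = [math.dist(p, q) for p in a for q in b]
    if method == "single":      # (10)
        return min(pairwise)
    if method == "complete":    # (11)
        return max(pairwise)
    if method == "average":     # (12)
        return sum(pairwise) / len(pairwise)
    d_c = math.dist(centroid(a), centroid(b))
    if method == "centroid":    # (13)
        return d_c
    if method == "ward":        # (15)
        return len(a) * len(b) * d_c ** 2 / (len(a) + len(b))
    raise ValueError(f"unknown linkage method: {method}")


def agglomerate(points, method="ward", stop_at=2):
    """Merge the two closest clusters until stop_at clusters remain.

    Returns (clusters, merges). clusters maps a cluster label to the indices
    of its objects; merges lists (label_r, label_s, new_label, dissimilarity).
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_label = len(points)
    while len(clusters) > stop_at:
        def dissimilarity(pair):
            r, s = pair
            return linkage_distance([points[i] for i in clusters[r]],
                                    [points[i] for i in clusters[s]], method)

        # Step 1: find the pair with the minimal dissimilarity.
        r, s = min(combinations(sorted(clusters), 2), key=dissimilarity)
        merges.append((r, s, next_label, dissimilarity((r, s))))
        # Step 2: replace clusters r and s by the merged cluster.
        clusters[next_label] = clusters.pop(r) + clusters.pop(s)
        next_label += 1
    return clusters, merges
```

With `stop_at=2` the loop ends as in Step 2 above; the recorded merges contain the information needed to draw the cluster tree.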

The results of the agglomerative hierarchical clustering method can be rep-
resented graphically as a tree. An example of such a hierarchical cluster tree
or dendrogram is shown in Figure 9. In such a graph the numbers at the hori-
zontal axis represent the indices of the original singletons and they are called
leaf nodes. The links between the objects are represented by the connecting hor-
izontal lines, called interior nodes. The height of the vertical link lines indicates
the distance between the linked objects. Note that disproportionately long ver-
tical lines can indicate that the corresponding objects are combined incorrectly.
With this graphical representation of the cluster tree we can easily define an
arbitrary number of clusters C by drawing a horizontal line in the dendrogram.
All the leaf nodes, representing the entities, that are connected below this line
belong to one particular cluster $c \in \mathcal{C}$ with $\mathcal{C} = \{1, 2, \ldots, C\}$. For the example
shown in Figure 9, three clusters can be defined by drawing a dividing line such
that it only bisects three vertical lines in the tree. This results in the clusters
$S_1 = \{1, 3, 4\}$, $S_2 = \{2, 5\}$ and the singleton $S_3 = \{6\}$.
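Drawing the dividing line can be expressed as applying only those merges whose height lies below the line. The sketch below assumes a merge list of the form (left, right, parent, height) with heights that never decrease along the tree; the heights and the parent labels 7 to 11 in the example are invented for illustration, since Figure 9 only provides the leaf order.

```python
def cut_dendrogram(leaves, merges, height):
    """Return the clusters obtained by cutting the tree at the given height.

    merges: iterable of (left, right, parent, merge_height), ordered from the
    first merge to the last. Merge heights are assumed to be monotone.
    """
    clusters = {leaf: [leaf] for leaf in leaves}
    for left, right, parent, merge_height in merges:
        if merge_height > height:
            continue  # this link is bisected by the dividing line
        clusters[parent] = clusters.pop(left) + clusters.pop(right)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical tree that reproduces the clusters named for Figure 9.
example_merges = [
    (1, 3, 7, 1.0),
    (2, 5, 8, 1.2),
    (7, 4, 9, 1.5),
    (9, 8, 10, 3.0),
    (10, 6, 11, 4.0),
]
three_clusters = cut_dendrogram([1, 2, 3, 4, 5, 6], example_merges, height=2.0)
```

Moving the cutting height up to 3.5 in this hypothetical tree would leave two clusters, which mirrors how the number of clusters is changed by shifting the line.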

The presented linkage methods will all give the same or almost the same
results when applied to well-structured data. When the structure of the data is
somewhat hidden or complicated, the methods may give quite different results.
In the latter case the single and complete linkage methods represent the two ex-
tremes of the generally accepted requirement that the 'natural' clusters must be
internally cohesive and, simultaneously, isolated from the other clusters. Single
linkage clusters are isolated but can have a very complex chained and noncohe-
sive shape. In contrast the complete linkage clusters are very cohesive, but may
not be isolated at all. The other three methods result in a trade-off between
cohesiveness and isolation of the resulting clusters.

3.3 Cluster results and validation

In this section we present the results of the disc defect clustering, obtained with
the agglomerative, hierarchical algorithm. From the various options presented
in Section 3.2.3 we select the Euclidean distance measure and Ward linkage 
method. The Euclidean distance measure is selected since it is easy to calculate 
and its geometrical interpretation is straightforward. The Ward linkage method 
is chosen since it provides a good trade-off between cluster cohesiveness and 
isolation. When compared to similar methods (average and centroid linkage) the 
Ward linkage appears to result in the most logical clustering based on analysis 
of the corresponding defect signals and their physical interpretation. 

The results of the clustering process for the set of reference measurements are
shown graphically in Figure 10. In the dendrogram we observe two inconsistent
links, denoted by A and B. The first one links the artificial scratch of 1320 µm
to, among others, the quarter black dots while we would expect it to be
linked to the 1120 and 1420 µm artificial scratches. Figure 11 reveals the cause
of this inconsistency. It appears that the MIRN signal has been normalized
differently during the measurements with the 1320 µm scratch. After removing
the DC offset and comparing different scratch measurements, the lowest MIRN
level will be higher for the 1320 µm scratch. From this one could conclude
that this particular scratch reflects more light than others, which is not true
in reality. This MIRN level difference is however adequately translated by the
signal mapping and hence this particular scratch is placed in another cluster.
During new measurements the phenomenon kept reappearing so most likely the
cause lies in a local variation of the reflection of the test disc. In our further
analysis we will carefully check whether this phenomenon influences the clustering
results.

The second inconsistency we observe in the dendrogram is related to the
different fingerprints in our defect database. The clustering algorithm assigns
a disproportionately large difference to the normal fingerprints and the heavy
fingerprint, indicated by the relatively long vertical lines below B. Although
the two types of fingerprints clearly differ, these differences are considered to
be less significant than, for example, those between a small edge black dot and
a large radial scratch. See also Figure 12. The cause of this inconsistency
lies in the different amplitude drops of the MIRN signal for normal and heavy
fingerprints respectively. Due to the long duration common to fingerprints,
a high number of samples will fall inside one particular amplitude band. The
amplitude difference for normal and heavy fingerprints causes this high property
value to appear at a different column in the property vector. The effect of
this on the distance between objects is shown graphically in Figure 13. In
order to reduce the mentioned inconsistency we must make the property vector
more robust against small amplitude variations. This can be done by adjusting the
weighting factor in (2). The value for the weighting factor is determined by
trial and error. The resulting value $W_2 = 5$ yields a more balanced mapping
and hence the clustering results are more in line with our interpretation of the
defect signals.

Figure 10: Dendrogram of the disc defect clustering; Euclidean distance, Ward
linkage, $W_m = 1$, $m = 1, 2, \ldots, 14$.

Figure 11: Different MIRN normalization for 1320 µm artificial scratch mea-
surements.

Figure 12: Comparing normal and heavy fingerprints, edge black dots and radial
scratches. Panels: (a) normal and heavy fingerprints; (b) black dot edge and
radial scratch.

Another clustering 'mistake' is the combination of black dots and white dots
in one cluster. Based on visual inspection and physical interpretation of the
corresponding defect signals, we would place these defects in separate clusters,
since these two types of defects are more or less opposites. The obvious difference
between the two types of defects is shown again in Figure 14. Although this
phenomenon does not result in inconsistent links in the dendrogram it is closely
related to the fingerprint case we discussed. The only signal property suitable
for making a distinction between higher and lower reflection is the signal mean
value from (1). The other properties are all based on the absolute MIRN signal
and hence give similar results for both black dots and white dots, as can be seen
from Figure 14. By adjusting the weighting factor in (1), a better distinction
between black dots and white dots is achieved. By trial and error the weighting
factor $W_1$ is determined, yielding $W_1 = 1 \cdot 10^{[\ldots]}$.

The clustering results from the adjusted algorithm are depicted graphically
in Figure 15. Comparing this result with the previous dendrogram in Figure 10
shows that the added weighting factors indeed reduce the inconsistency signifi-
cantly. Table 1 summarizes which disc defects are grouped together in the dif-
ferent clusters that are selected in the dendrogram. From this table it becomes
clear that with the adjusted algorithm also the 1320 µm scratch is clustered
according to our expectations. As we mentioned before we can easily alter the







Figure 13: Dissimilarities caused by unbalanced mapping. Panels: (a) sensitive
vectors (3, 15) and (15, 3); (b) robust vectors (5, 8) and (8, 5).

Figure 14: Different signal behavior for black dots and white dots.

Figure 15: Consistent dendrogram of the disc defect clustering; seven clusters
selected, Euclidean distance, Ward linkage, $W_1 = 1 \cdot 10^{[\ldots]}$, $W_2 = 5$.



Table 1: Objects in disc defect clusters.

Cluster     Objects
S1          middle black dot 700 µm
            middle black dot 900 µm
            scratch 420-820 µm
S2          middle black dot 1100 µm
            scratch 920-1120 µm
S3          scratch 1320-1620 µm
S4, S5      edge black dot 700-900 µm
(boundary   quarter black dot 700-900 µm
not         scratch at R = 32 mm
legible)    scratch 320 µm
            scratch at R = 32 mm
            scratch at R = 35 mm
S6          all white dots
S7          all fingerprints

Table 2: Combining disc defect clusters.

Number of clusters    Combined clusters
6                     S1 ∪ S2
5                     S1 ∪ S2 ∪ S3
4                     S1 ∪ S2 ∪ S3, [...]
3                     [...]
2                     [...]
number of clusters by shifting the bisecting line up or down in the dendrogram.
Table 2 shows which clusters are combined when a smaller number of clusters
is selected. Both from Table 2 and the dendrogram in Figure 15 the existing
hierarchy in the disc defect database becomes clear. Actually two major clusters
can be identified: one with all middle black dots and artificial scratches and the
other holding all the other defects. In both these large groups a further subdivi-
sion of disc defect types can be made, which is clearly depicted by the step-like
linkage in the right and left parts of the dendrogram. Close examination of the
various combinations of disc defects shows that this clustering is according to
our expectations, based on physical interpretation of the different disc defects.

The measured MIRN signals, corresponding to the disc defects in each clus-
ter, are shown in Figure 16. Note that the correlation between all signals in
the same cluster is optimized to prevent the differences in used pre-trigger time
and defect duration from cluttering the general view. Figures 17 and 18 show
the corresponding radial and focus error signals respectively. In these figures
we observe that the REN and FEN signals also reveal some characteristic dis-
tinctions between the various clusters. The dissimilarities however are far less
distinct than those for the MIRN signals. Applying the clustering algorithm to
the REN and FEN signals therefore does not give satisfactory results. However
the results make us believe that with a different or extended set of mapping
functions, specifically chosen for the REN and/or FEN signals, better cluster-
ing results can be obtained with the REN and FEN signals. Another possibility
that could lead to even better and more distinctive disc defect clusters is to use a
combination of the mappings for the MIRN, REN and FEN signals.
3.4 Cluster modelling 

Now that the disc defect clusters are available, one step of the classification
process remains. The clusters themselves are nothing more than distinctive groups
of similar defect signals. The data reduction we strive for is however not yet
achieved. In this section we discuss how we can obtain an adequate description
for each cluster that answers to this classification objective.

The most suitable class description is that of a signal or model from which
such a signal can be derived. Not only is it easy to compare graphs of different
signals qualitatively but a signal, or better a time series, can be used directly
in mathematical computations. A class description in words, for example, lacks
this possibility. The problem now is to derive a representative signal (or model)
that adequately describes all the signals belonging to one cluster. Inevitably a
trade-off must be made between the accuracy of the description for individual
signals and its general validity for the whole cluster.

Figure 17: REN signals for clustered disc defects.

Figure 18: FEN signals for clustered disc defects.

A straightforward method for this task is to fit a function to the time series 
in the cluster (see Figure 16) that approximates the data according to some 
criterion. The key issues for this approach are the choice for a general form 
of the function and the selection of a suitable criterion. The most widely used 
criterion is the sum of the squares of the errors between the fitted function 
and the data points. Methods using this criterion are usually denoted as least 
squares (LS) methods. 

Preferably the function or model structure is based on (physical) laws that 
relate the signals to the system that generates them. When such a structure is 
unavailable a more general structure must be used. Examples of such general
function structures are the Fourier and Prony decomposition that approximate 
the data with a sum of sinusoidal or complex exponential functions respectively. 
Other possibilities are to approximate the data with polynomials or splines. 



3.4.1 Least squares polynomial fitting 

To obtain descriptive signals for each disc defect cluster we will apply a least
squares polynomial fitting method. We choose a polynomial to approximate the
defect signals in the first place because we lack a parametric function structure
that is based on a disc defect model. Fourier and Prony decomposition are not
usable since the disc defect signals hardly show any periodic behavior, except
for the small oscillations we observed in the MIRN signals during the passage of
a fingerprint. A good alternative would be to use splines for the approximation
when a spline fitting algorithm is available that can deal with several signals
simultaneously.

The results of the fitting with a polynomial of degree n = 15 are shown
in Figure 19. From this figure we conclude that the time series obtained from
the fitted polynomial functions describe the disc defect classes reasonably well.
However, due to the nature of the fitted function, we observe some small oscil-
lations in the resulting signal that are not present in the original time series.
Especially at the edges of the defect signal these deviations can become signifi-
cant when we intend to use the defect class signals for detection. For that purpose
the begin and end regions of the signal must be known as accurately as possible.
Applying a fitting routine that uses splines could resolve this since splines offer
the possibility to impose demands on the slope of the fitted signal in regions
where additional accuracy is desired.
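To illustrate what fitting one polynomial to several signals at once involves, the sketch below pools the samples of all signals in a cluster into a single set of normal equations and solves them by Gaussian elimination. It is an assumption about how such a fit could be set up, not the routine used for Figure 19, and the function names are hypothetical. In plain monomials this approach is only numerically safe for low degrees; a degree of 15 would in practice call for scaled time axes or an orthogonal basis.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(signals, degree):
    """Least squares fit of one polynomial to all (t, y) samples of all signals.

    Returns the coefficients, lowest order first.
    """
    size = degree + 1
    normal = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for signal in signals:
        for t, y in signal:
            powers = [t ** i for i in range(size)]
            for i in range(size):
                rhs[i] += powers[i] * y
                for j in range(size):
                    normal[i][j] += powers[i] * powers[j]
    return solve_linear(normal, rhs)


def evaluate(coefficients, t):
    """Evaluate the fitted polynomial at time t."""
    return sum(c * t ** i for i, c in enumerate(coefficients))
```

The fitted coefficients minimize the sum of squared errors over every sample in the cluster, so the resulting curve is a compromise between the individual signals.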

3.4.2 Class validation

Now that the classification of our set of reference disc defects is complete we can
perform some tests in order to validate the method. This is done by applying
the property mapping from Section 3.2.2 to the derived class signals and doing the
same for some well-known, new defect measurements. Then we calculate the
distance between the property vector of the test measurement and those of the
various defect classes, using the Euclidean distance as we did in the clustering
algorithm. The cluster to which the test measurement belongs should yield
the smallest distance value. An extra check is performed by calculating the
correlation between the test signal and the various class signals. This time the
correct class must result in the highest correlation coefficient. The signals we
used for these tests are shown in Figure 20 and the resulting distances and
correlation coefficients are summarized in Tables 3 and 4. By comparing these
values with our expectations based on visual inspection of the test signals, we
conclude that the classification process performs well.
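The two validation checks can be sketched as follows. The text does not state which correlation measure is used, so the zero-lag Pearson correlation coefficient is assumed here; the class names and all numbers in the example are invented for illustration and are not taken from Tables 3 and 4. `math.dist` requires Python 3.8 or later.

```python
import math


def correlation(x, y):
    """Zero-lag Pearson correlation coefficient of two equal-length signals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    return num / den


def nearest_class(test_vector, class_vectors):
    """Class whose property vector has the smallest Euclidean distance."""
    return min(class_vectors,
               key=lambda name: math.dist(test_vector, class_vectors[name]))


def best_correlated_class(test_signal, class_signals):
    """Class whose signal has the highest correlation with the test signal."""
    return max(class_signals,
               key=lambda name: correlation(test_signal, class_signals[name]))


# Hypothetical property vectors and class signals for two classes.
class_vectors = {"class_a": [2.0, 9.0, 1.0], "class_b": [8.0, 1.0, 4.0]}
class_signals = {"class_a": [0.0, -1.0, -2.0, -1.0, 0.0],
                 "class_b": [0.0, 1.0, 2.0, 1.0, 0.0]}
by_distance = nearest_class([2.5, 8.0, 1.5], class_vectors)
by_correlation = best_correlated_class([0.1, -0.9, -2.2, -1.1, 0.0],
                                       class_signals)
```

A test measurement is considered correctly classified when both checks point to the class expected from visual inspection.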
Figure 19: Multivariate 15th order polynomial fit for clustered disc defect sig-
nals.

Figure 20: Various test measurements used for class validation. Panels include
A: middle black dot 700 µm; B: edge black dot 1100 µm; C: artificial scratch
1420 µm.
Table 3: Euclidean distance between property vectors of class signals and test
measurements A, ..., F from Figure 20.

                          Signal
Class         A       B       C       D       E       F
S1 ∪ S2     370     906     373    1280    1214     916
S3          604    1250     214    1592    1164    1259
S4          839     258    1108     548    1363     244
S5          503     439     842     877    1410     452
S6         1400     759    1509     445    1195     731
S7         1721    1732    1388    1689     179    1722

Table 4: Correlation between class signals and test measurements.

                          Signal
Class         A       B       C       D       E       F
S1 ∪ S2   0.083   0.850   0.854   0.274   0.332   0.738
S3        0.768   0.605   0.999   0.364   0.444   0.538
S4        0.896   0.998   0.603   0.192   0.202   0.968
S5        0.945   0.983   0.654   0.208   0.277   0.934
S6        0.279   0.220   0.387   0.985   0.388   0.304
S7        0.627   0.621   0.863   0.494   0.815   0.408
Figure 21: General structure of the detection problem.

4 Defect detection 

Various control strategies are available in order to improve playability with re-
spect to disc defects. The success of all these techniques depends on the ability
to detect those specific disturbances in time to take the required countermea-
sures. When information on the type of defect is available it further becomes
possible to select the most suitable strategy. This detection and, closely related,
identification of disc defects are the subjects of this chapter.

4.1 The detection problem 
4.1.1 Concept and general structure

The structure of the detection problem which we will deal with is the following.
Given a signal record $(y_1, y_2, \ldots, y_n)$, decide which of the two hypotheses $H_0$
or $H_1$ is true:

$H_0$ : $(y_1, y_2, \ldots, y_n)$ follows the model $S_{\theta_0}$
$H_1$ : there exists a time instant $r$, $2 \le r \le n$, such that
        $(y_1, \ldots, y_{r-1})$ follows the model $S_{\theta_0}$ and
        $(y_r, \ldots, y_n)$ follows the model $S_{\theta_1}$

Here $S_\theta$ is a family of models parameterized by the vector $\theta$. See also Figure 21.
These models and the signal record together form the 'source' block in Fig-
ure 21. The transition mechanism maps the hypotheses for a given source into
an observation space. This mapping follows from the criteria with which the de-
tection must comply. The selection of the valid hypothesis is done by applying
a decision rule to the mapping result.

Note the subtle difference between the above signal record and the time
series mentioned in Chapter 3. If we were to choose the record length n equal to
that of the whole time series N, the detection structure would prohibit the search
for multiple detections between 0 and N. By choosing the size of the observation
window n < N, a sequential search for individual detections can be performed
in each part $(y_k, y_{k+1}, \ldots, y_{k+n})$, $k = 0, n + 1, 2n + 1, \ldots, N$ of a complete
time series. Note that in this chapter the symbol k represents an index instead
of an element of a set.

For online detection it is important to realize that we are always dealing
with a causal system. This implies that it is impossible to detect an anomaly
precisely at the moment that it occurs. Some delay $\Delta t$ is inherently present
between the detection at $t = k$ and the actual occurrence at $t = k - \Delta t$ of the
anomaly. The goal of a detection system now is to detect a change as quickly as
possible after it has occurred, so that, at each time instant, at most one
change has to be detected between the previous detection and the current time
point n.
Figure 22: General structures for the detection of signal changes. Panels:
(a) multiple filter structure; (b) residual filter structure.



Finally we remark on the similarity between the detection problem and clas-
sification or identification. In both cases the behavior of a dynamic system,
represented by a signal, is compared with known types of behavior. Based
on this comparison a decision is made according to some rules. For detection
the decision is whether there is an anomaly present or not, while in the case
of classification or identification we decide to which class the behavior (signal)
belongs.

The generalized form of the above detection structure is the so-called mul-
tiple filter structure as depicted in Figure 22a. The observations $y(k)$ with
$k = 1, 2, \ldots, N$ are processed by a bank of filters, each of which is based on a
particular hypothesis. For instance in the mentioned structure Filter 1, associ-
ated with $H_0$, assumes no change has occurred and Filter 2, which is associated
with $H_1$, assumes that a particular type of change has occurred at a certain time
instant. The outputs of the filters, $\gamma$, represent signals that should typically be
small if the corresponding hypotheses are in fact correct. The decision mecha-
nism is therefore in essence based on determining which of the filters is doing
the 'best' job of keeping the corresponding $\gamma$'s small.

Another general structure for the detection of abrupt changes is the residual-
based structure, illustrated in Figure 22b. In this case a filter is designed
based on the assumption that no abrupt change has occurred or will occur. The
filter produces a prediction $\hat{y}$ of the observed signal $y$, based on this assumption
and the history of the observed signal. This prediction is subtracted from the
actual signal to produce a residual signal $\gamma$. If no abrupt change has occurred,
$\gamma$ should be small. Consequently deviations from this behavior are indicative
of anomalies, and it is on this fact that the decision mechanism is based.
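A minimal sketch of the residual-based structure is given below. The text does not specify the prediction filter, so the mean of the most recent samples is used here purely as a stand-in predictor; the parameter names `history` and `threshold` are assumptions of the example.

```python
def residual_detector(y, history=5, threshold=0.5):
    """Return (index, residual) of the first sample flagged as anomalous.

    The prediction is the mean of the previous `history` samples, which
    encodes the assumption that no abrupt change has occurred. Returns
    None when the residual never exceeds the threshold.
    """
    for k in range(history, len(y)):
        prediction = sum(y[k - history:k]) / history
        residual = y[k] - prediction
        if abs(residual) > threshold:
            return k, residual
    return None


# Hypothetical normalized signal: constant level followed by a sudden drop.
signal = [1.0] * 20 + [0.2] * 10
detection = residual_detector(signal)
```

In this example the drop at index 20 produces a residual of about -0.8 and is flagged immediately, while an undisturbed signal yields no detection.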

4.1.2 Defect detection and identification

In the previous section we mentioned three cases in which the need for detection
of abrupt changes arises. For disc defect detection in optical disc drives both the
second and third reason apply. Since our goal is to prevent erroneous
information from influencing the operation of the system, disc defect detection
can be seen as an alarm that initiates appropriate countermeasures. Known
countermeasures all adjust or completely replace the servo controller in one way
or the other, effectively adjusting the tracking mechanism of the read out unit.
This is in line with the third reason we mentioned.

From the goal we stated, various requirements can be derived which a disc
defect detection mechanism must meet. In the first place the detection time
should be as short as possible. Second we require the highest possible accuracy
or reliability. It will be clear that those requirements are by nature conflict-
ing. The search for an optimal detection algorithm with respect to speed and
accuracy is further complicated by the robustness demands. Despite the vast
number of different disc defects and the lack of information on other influencing
parameters (e.g. disc reflection, temperature, substrate thickness) the detection
algorithm must satisfy the first two requirements.

Next to timely detection of a disc defect, we also want to know the type
of disc defect. With the general multiple filter structure from Figure 22a both
the detection and identification of disc defects can be combined. This can be
achieved by formulating a hypothesis for each filter that assumes the presence of
one of the defect types that we wish to distinguish. The same can be achieved
with the residual filter structure when several of these filters are used in paral-
lel. The resulting defect classes $S_c$, $c \in \mathcal{C}$, of a classification like the one we
discussed in Chapter 3 can form a good starting point in formulating the required
hypotheses.

Decoupling of the detection and identification of disc defects is also possible.
This implies that we need one algorithm that is able to detect all different types
of disc defects. This relaxes the need for fast defect identification and hence
its accuracy can be improved. However the chance of false alarms during the
defect detection increases. Since the detector must be able to detect all defects
its resolution will be reduced. This makes it harder to distinguish disc defects
from other signal distortions. However when the countermeasures initiated by a
defect detection do not endanger the proper functioning of the drive in case of a
false alarm, the decreased reliability of the detector becomes of less importance.

4.2 Disc defect detection method

Independent of the type of detection method and structure, the goal of disc
defect detection implies the need for an on-line algorithm. Off-line detection
is not feasible due to the stringent demands on startup times for optical disc
drives. There is simply no time to scan the whole disc and memorize the location
of all disc defects 'off-line' before the actual data read out starts. However for
the selection of affected signal regions as mentioned in Chapter 3, an off-line
detection algorithm would suffice.

Various on-line detection methods that fit one of the general forms of Fig-
ure 22 have been developed. In this section we present a disc defect detection
method that is based on the well-known concept of maximum likelihood (ML).
The corresponding theory is treated extensively in literature and work has
already been done on implementing detectors of this kind in radar and hard-disk
drive applications. The method leads to a detector that is easy to implement
on-line and that is closely related to defect classification.

4.2.1 Maximum likelihood detection

The method we will discuss here uses the MIRN signal as input. The task of the
detector will be to detect whether the influence of a defect, represented by the
set of defect classes $S_c$, is present in this signal. For the sake of clarity we only
consider two possibilities. The first is that no defect is present and the second is
that a defect, represented by $S_c$, is present. As discussed in the previous section
the detector can easily be extended to cover all defect types by using several
detectors in parallel.

We start by defining the two corresponding hypotheses $H_0$ and $H_1$ for the
disc defect detection problem. The null hypothesis $H_0$ states that no disc defect
is present and $H_1$ is true when a disc defect is present. The observations of the
MIRN signal under the two hypotheses are:

$$H_0 : y(t_s + k) = y_n(t_s + k) \tag{16}$$
$$H_1 : y(t_s + k) = y_n(t_s + k) + y_c(k) \tag{17}$$

with $k = 1, 2, \ldots, N$ determining the detection window $y(t_s) = (y(t_s + 1),
y(t_s + 2), \ldots, y(t_s + N))$. The MIRN signal is modelled as a stochastic process
$y(t) = \mu(t) + v(t)$ with a purely deterministic part $\mu(t)$ and a stochastic part
$v(t)$. The signal $y_c(k)$ denotes the defect signal obtained from $S_c$, and $t_s$ is
the defect arrival time. The observations of the MIRN signal and the defect
signal are jointly represented as the source in Figure 21. Attached to the two
hypotheses are the two conditional probability densities $p_{y(t_s)|H_0}(y|H_0)$ and
$p_{y(t_s)|H_1}(y|H_1)$. They give the likelihood of the actual observations y under
$H_0$ and $H_1$ respectively.

In order to determine which of the two hypotheses is true a decision rule is
needed. The requirement for such a rule is that it maximizes the reliability of
the decision for a given detection time. Stated differently it must minimize the
detection time for a given level of reliability. We now assume that the chance
of a false alarm¹ and that of a missed detection² are directly related to the
detection time or, with a given sample time, the size of the detection window.
In that situation the likelihood ratio test yields an optimal decision rule with
respect to those criteria. It is defined as:

$$\frac{p_{y(t_s)|H_1}(y|H_1)}{p_{y(t_s)|H_0}(y|H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \eta \tag{18}$$

where $H_1$ is accepted when the ratio in the left-hand side of (18) is greater than
the threshold $\eta$. Else $H_1$ is rejected, indicating that no defect is detected. The
likelihood ratio forms the probabilistic transition mechanism while the threshold
comparison is the decision rule according to Figure 21.

For simplicity we now assume that the normal, unaffected MIRN signal is an
uncorrelated, zero-mean stochastic process (Gaussian white noise) with variance
$\lambda$. In that case the likelihood ratio test for the presence of a disc defect is:

$$\sum_{k=1}^{N} y(t_s + k)\, y_c(k) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \frac{1}{2} \sum_{k=1}^{N} y_c^2(k) + \lambda \ln \eta \tag{19}$$

which can be written in the form of a simple discrete time FIR-filter. The
detector then becomes:

$$\sum_{k=0}^{N-1} y_c(N - k) \left( z^{-k} \cdot Y(z) \right) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; TH \tag{20}$$

where TH denotes a new threshold value.

¹ Rejecting $H_0$ when it is true; type I error.
² Accepting $H_0$ when it is false; type II error.
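The step from the likelihood ratio (18) to the test (19) is not written out. A sketch of the derivation, assuming that the N samples in the detection window are independent and Gaussian with variance $\lambda$, with mean zero under $H_0$ and mean $y_c(k)$ under $H_1$:

```latex
\begin{align*}
p_{y(t_s)|H_0}(y|H_0) &= \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\lambda}}
    \exp\!\left(-\frac{y^2(t_s+k)}{2\lambda}\right), \\
p_{y(t_s)|H_1}(y|H_1) &= \prod_{k=1}^{N} \frac{1}{\sqrt{2\pi\lambda}}
    \exp\!\left(-\frac{\bigl(y(t_s+k) - y_c(k)\bigr)^2}{2\lambda}\right), \\
\ln \frac{p_{y(t_s)|H_1}(y|H_1)}{p_{y(t_s)|H_0}(y|H_0)}
  &= \frac{1}{\lambda} \sum_{k=1}^{N} y(t_s+k)\, y_c(k)
   - \frac{1}{2\lambda} \sum_{k=1}^{N} y_c^2(k).
\end{align*}
```

Comparing this log-likelihood ratio with $\ln \eta$ and multiplying both sides by $\lambda > 0$ gives the correlation on the left-hand side of (19) and the threshold $\frac{1}{2} \sum_k y_c^2(k) + \lambda \ln \eta$ on the right.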

The assumption that the unaffected MIRN signal can be described by Gaus-
sian white noise is not a very realistic one. A more realistic representation can
be obtained by incorporating the coloring of the noise for the quasi-stationary
MIRN signal. The required changes in the FIR-filter of (20) are specified in
literature. However, for reasons of simplicity, we continue to use the white noise
assumption in the remainder of this chapter.
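Under the white noise assumption the detector of (19) amounts to correlating a sliding window of the signal with the defect reference and comparing the result with a threshold. The sketch below illustrates this; it uses 0-based window indices instead of the $t_s + 1, \ldots, t_s + N$ convention of the text, and the reference signal, noise variance and value of $\eta$ in the example are hypothetical.

```python
import math


def ml_detect(y, y_c, noise_var, eta):
    """Slide an N-sample window over y and apply the test of (19).

    Returns (t_s, statistic, threshold) for the first window in which the
    correlation with the reference y_c exceeds the threshold, else None.
    """
    n = len(y_c)
    threshold = 0.5 * sum(v * v for v in y_c) + noise_var * math.log(eta)
    for t_s in range(len(y) - n + 1):
        statistic = sum(y[t_s + k] * y_c[k] for k in range(n))
        if statistic > threshold:
            return t_s, statistic, threshold
    return None


# Hypothetical defect reference (signal drop) and a noise-free test signal.
reference = [-0.2, -0.4, -0.6, -0.8]
test_signal = [0.0] * 10 + reference + [-0.8] * 5
result = ml_detect(test_signal, reference, noise_var=0.01, eta=1.0)
```

In this example the threshold is 0.6 and the statistic first exceeds it for the window starting at index 9, one sample before the reference is fully aligned with the defect. This shows how the threshold trades detection speed against reliability.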

The choice of the threshold value TH and the detection window size N de-
pends on the requirements of detection speed and reliability. These requirements
in turn depend on other elements of the optical disc drive such as the
control strategy used during disc defects, the data decoding and error correction
algorithms. More research and experimental validation are needed on the integrated
system in order to determine suitable values for these important parameters.

4.2.2 Some intuitive adjustments 

The FIR-filter from (20) forms the core of the maximum likelihood detector.
Basically the output of this filter is a multiplication of N samples of the input
signal with N corresponding samples of the assumed defect reference signal.
We already suggested using the time series $S_c$ as reference signals $y_c(k)$ in C
parallel detection filters. In Chapter 3 we mentioned that the edges of these
time series, obtained from fitted polynomials, are not very accurate. Another
disadvantage of these signals is that they hardly show any deviations from the
normal MIRN signal during the first few samples. When using these signals in
the suggested detector this will result, when N is chosen small as required by
the demands for fast detection, in an output that will hardly show any change
when a defect is present. Therefore we decide to approximate the edges of the
fitted defect signal with a straight line as depicted in Figure 23 for one of the
reference signals.

All the models are adjusted as discussed and simulations are done with
the resulting FIR-filter. For one of these simulations the resulting output of the
filter is shown in Figure 24. We remark that this output is approximately 20
times higher than for the initially used defect class models with the smooth edges
(gray line in Figure 23). The actual detection takes place by searching for the
time instant where one of the output signals reaches the threshold level TH. The
figure further shows that for all defect models $y_c(k)$ the filter gives a significant
output. From this observation we conclude that the distinctive capabilities of
the various filters (one for each defect class as identified in Chapter 3) are low.
Further investigations show that, independent of the class to which a test signal
belongs, the reference signal for $S_1 \cup S_2$ results in the highest filter output and
that for $S_7$ in the lowest.

From the above we conclude that a combined defect detection and identification
approach does not give satisfactory results. More accurate models, especially
at the starting edge of the affected signals, and better knowledge of the noise
coloring and probability distributions could possibly improve the performance.
However, we now choose another approach. We observe that the slope of the
reference signal y_c(k) in the detection window is the feature that determines
the amplitude of the FIR-filter response. In the first N samples the reference
signal for the defect class S_1 ∪ S_2 appears to have the steepest slope of all
signals. See also Figure 24, where the output of the FIR-filter is shown for
the different reference signals.

Figure 23: Straight line approximation for the reference signal edges.

Figure 24: FIR-filter output for defect class reference signals; 700 µm quarter
black dot, N = 10.

Figure 25: Reference signal with infinite slope for defect detector.

The idea now is to model the defect with a signal that has an infinitely
steep slope. The amplitude of this 'block form' defect model can be chosen at
will, as long as the corresponding threshold value is adjusted accordingly. The
new defect model is depicted in Figure 25. Simulation results with this new
reference signal are shown in Figure 26. They clearly show the improvement in
detection speed that can be achieved by using this abstract defect model. Note
that when the amplitude of the block form defect signal is chosen equal to one,
the output of the FIR-filter reduces to a simple summation of N samples of the
incoming MIRN signal.
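With a unit-amplitude block reference the detector therefore becomes a moving sum followed by a threshold comparison. The following is a minimal sketch of that idea, assuming an offset-free MIRN signal in which a defect shows up as a positive deviation; the function name and the test signal are hypothetical.

```python
def block_detector(mirn, n, threshold):
    """Return the index of the first sample at which the sum of the last
    n samples reaches the threshold, or None if that never happens."""
    running = 0.0
    for k, x in enumerate(mirn):
        running += x                  # add the newest sample
        if k >= n:
            running -= mirn[k - n]    # drop the sample that left the window
        if k >= n - 1 and running >= threshold:
            return k
    return None


# Hypothetical signal: 20 quiet samples followed by a step of height 0.2.
mirn = [0.0] * 20 + [0.2] * 20
print(block_detector(mirn, n=10, threshold=0.5))
```

The running sum needs only one addition and one subtraction per sample, which fits the elementary-block style of implementation. The threshold in this sketch is arbitrary; as noted above, it has to be scaled together with the chosen block amplitude.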

4.2.3 Disc defect identification method

Figure 26: FIR-filter output for block form reference signal; 700 µm quarter
black dot, N = 10.

The choice for the 'infinite slope' reference signal also implies that an extra
algorithm is needed to identify the exact type of disc defect whenever one is
detected. A logical choice is to use the MIRN signal mapping we developed for
the defect clustering algorithm. As soon as a defect is detected by the defect
filter, we start to construct the property vector p for the incoming MIRN
signal. See also Sections 3.2.2 and 3.2.3. Initially only N samples of the
signal are available, but each new sample adds information and hence the
estimates of the various properties become more accurate. For the reference
signals S_c this procedure can be performed off-line, resulting in c property
matrices or look-up tables, denoted by P_c. Each row n, n = 1, 2, 3, ..., of
such a matrix holds the property vector for the first N + n − 1 samples of the
corresponding reference signal. At each time instant k we can now calculate the
Euclidean distance between the property vector p of the input signal and the
corresponding row P_c,n of every class reference signal. When the number of
available samples is sufficiently high, one of these distances will become
significantly smaller than the others, thereby identifying the occurring disc
defect on-line.
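The look-up table procedure can be sketched as follows. This is an illustration under stated assumptions only: the two-element property vector (mean and peak value) is a hypothetical stand-in for the mapping of Section 3.2.2, and the class labels and reference signals are invented for the example.

```python
import math


def property_vector(samples):
    """Hypothetical property vector: (mean, peak absolute value)."""
    mean = sum(samples) / len(samples)
    peak = max(abs(s) for s in samples)
    return (mean, peak)


def build_lookup(reference, n):
    """Off-line step: row j (0-based) holds the property vector of the
    first n + j samples of the class reference signal."""
    return [property_vector(reference[: n + j])
            for j in range(len(reference) - n + 1)]


def identify(samples, tables, n):
    """On-line step: compare the property vector of the samples seen so
    far with the matching row of every table; return (label, distance)
    of the nearest class, or None if no table has such a row."""
    row = len(samples) - n
    if row < 0:
        return None
    p = property_vector(samples)
    best = None
    for label, table in tables.items():
        if row < len(table):
            d = math.dist(p, table[row])  # Euclidean distance
            if best is None or d < best[1]:
                best = (label, d)
    return best


# Hypothetical reference signals for two invented defect classes.
N = 10
tables = {
    "shallow": build_lookup([0.1 * i for i in range(1, 31)], N),
    "deep": build_lookup([0.3 * i for i in range(1, 31)], N),
}
measured = [0.3 * i + 0.01 for i in range(1, 31)]
print(identify(measured[:15], tables, N))
```

Because the tables are computed off-line, the on-line cost per sample is one property update plus one distance calculation per class.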

4.3 Detection performance 

In this section we present a simulation model of the defect detection and
identification algorithm discussed in the previous section. With this model we
try to assess whether the intuitive design choices we made result in an
improved defect detector. Next to the validation of the method we also discuss
various practical issues that are of importance for the integration of the
algorithm in optical disc drives.

4.3.1 Simulation results 

The simulation model of the defect detection and identification algorithm is
made in Simulink. As much as possible we restrict ourselves to the use of
elementary blocks like summation points, switches, comparators and unit delay
blocks. This approach results in a basic model for which translation to a real
hardware implementation is relatively easy. Details about the simulation model
can be found in Appendix 6.

In Figure 27 the simulation results are shown for a quarter black dot of
900 µm. The graph shows that the new detector responds about 30 µs faster to
the defect than the currently implemented detector. This improvement in
detection speed comes at the cost of some 'false' alarms. As discussed in
Section 4.2.1, an optimal threshold value can be determined that gives the best
trade-off between detection time and false alarms.

Figure 27: ML detector simulation result for a 900 µm quarter black dot;
N = 10, TH = 0.5.

Experiments with other values for the detection window size N revealed
no significant changes in the resulting outputs of the simulation model. The
corresponding result of the implemented identification algorithm for the black
dot simulation is shown in Figure 28. From this graph it becomes clear that the
correct defect type, that is, the class to which the defect belongs, can be
determined in approximately 45 µs.



4.3.2 Validation 

Simulation results already showed that the maximum likelihood detector with
the block form reference signal performs as well as or better than the
currently used detector. However, we also want to compare the results of the
detector with the simple case in which defects are detected whenever the MIRN
signal itself passes a certain threshold level. In order to do this we adjust
the threshold levels for both methods so that no false alarms occur during the
simulation. With these TH values we then compare the resulting detection times
for the ML detector and the direct method. We perform simulations for several
different test defects, where for each defect three different MIRN measurements
are used. The results of these comparisons are summarized in Table 5. From
this table we can conclude that the ML detector gives an output signal with a
better signal-to-noise ratio. Therefore the threshold level can be set lower in
this case, yielding faster detection. From the experiments we also conclude
that in all cases the identification algorithm indicates the correct defect
class for the test measurements within half of the total defect duration time.

Figure 28: Defect identifier simulation result for a 900 µm quarter black dot;
N = 10, TH = 0.5, Euclidean distance measure.

4.3.3 Implementation

The simulation model is constructed in such a way that it can form the basis
for the actual implementation of the algorithm. The most important issue that
requires attention during implementation is the offset cancellation of the MIRN
signal. During simulations we use MIRN signals from which the DC offset is
removed. See also Section 3.2.1. Tests with the simulation model show that
the method is very sensitive to these offsets in the MIRN signal. As discussed
in Section 3.2.1, the offset can be determined by calculating the average value
of the MIRN signal when it is unaffected by any disturbances. In an on-line
implementation the required mean value can be calculated from a fixed number
of unaffected samples and it can be updated repeatedly. Furthermore, a good
initial offset value must be available that, for instance, is determined during
the drive's initialization sequence.
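One possible way to organize such an on-line offset estimate is sketched below. It assumes that a flag is available telling whether the current sample is affected by a defect (for example the detector output itself); the class name, the block-wise update and the choice to discard a partially filled block are assumptions made for the illustration, not details from the simulation model.

```python
class OffsetTracker:
    """Block-wise mean of unaffected MIRN samples, used as DC offset."""

    def __init__(self, initial_offset, block_size):
        # initial_offset: e.g. measured during the drive's initialization
        self.offset = initial_offset
        self.block_size = block_size
        self._acc = 0.0
        self._count = 0

    def update(self, sample, defect_active):
        """Feed one raw sample and return the offset-corrected sample."""
        if defect_active:
            # Discard the partial block so that affected samples never
            # contaminate the mean value.
            self._acc = 0.0
            self._count = 0
        else:
            self._acc += sample
            self._count += 1
            if self._count == self.block_size:
                self.offset = self._acc / self.block_size
                self._acc = 0.0
                self._count = 0
        return sample - self.offset


tracker = OffsetTracker(initial_offset=1.0, block_size=4)
for raw in [1.2, 1.2, 1.2, 1.2]:
    tracker.update(raw, defect_active=False)
print(tracker.offset)
```

Freezing the estimate while a defect is active keeps the defect itself from being absorbed into the offset, at the price of a slower response to genuine drift.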

Another issue that must be taken into consideration when implementing a 
detector in a (re-)writable drive is the laser power adjustment. When an optical
disc drive switches from write mode to read mode or vice versa, the laser is 
switched between high and low power. This adjustment causes a severe change 
in the MIRN signal level to which a defect detector, incorrectly, will react. An 
easy way to deal with this phenomenon is to ignore the defect detector output 
for a short period of time whenever a laser power adjustment takes place. 
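Ignoring the detector output around a laser power switch amounts to a simple hold-off gate. The sketch below assumes that the drive provides a per-sample flag marking the moment of a power switch; the function name and the hold-off length are hypothetical.

```python
def gate_detector(flags, power_switch, holdoff):
    """Suppress detector flags for `holdoff` samples, starting at each
    sample where a laser power switch is signalled."""
    gated = []
    remaining = 0
    for flag, switched in zip(flags, power_switch):
        if switched:
            remaining = holdoff       # (re)start the hold-off period
        if remaining > 0:
            gated.append(False)       # ignore the detector output
            remaining -= 1
        else:
            gated.append(flag)
    return gated


flags = [False, True, True, True, False, True]
switch = [False, True, False, False, False, False]
print(gate_detector(flags, switch, holdoff=3))
```

A suitable hold-off length depends on how quickly the MIRN level settles after the switch, which would have to be measured on the actual drive.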
Table 5: Detection time difference for ML and direct threshold methods (for
negative values ML detection is faster).

Table 5 footnote: Time difference unknown since no detection occurred for the
direct threshold method with the value required to prevent false alarms.

5 Conclusions

With respect to disc defect classification the following can be concluded: 

• The hierarchy is a suitable disc defect classification structure, as
discussed in Sections 3.1 and 3.3.

• With a combination of signal theory and physical insight, a usable mapping
of defect measurements into a property space is obtained, which is
presented in Section 3.2.2.

• As shown in Section 3.3, the Euclidean distance and Ward linkage are
geometric dissimilarity measures that are usable in distinguishing different
types of disc defects.

• The developed classification method did not give satisfactory results when
applied to other servo signals, as became clear in Section 3.3.

• The fitted polynomials that were used to describe the defect classes,
presented in Section 3.4.1, are far from perfect but usable in classifying new
disc defects, as followed from Section 3.4.2.

The following can be concluded about disc defect detection in general and the 
presented approach in particular: 

• Detection is a form of classification or identification as became clear in 
Section 4.1. 

• Causality in on-line detection methods implies a trade-off between detec- 
tion speed and reliability as was shown in Section 4.1.1. 

• The method of maximum likelihood presented in Section 4.2.1 provides a
way to obtain a detection algorithm that is optimal with respect to speed
and reliability.

• The reference signals describing the obtained defect classes are not
suitable for use in a detection structure with multiple ML detectors, as
followed from Section 4.2.2.

• Disc defect detection was improved by using a single ML detector that
is based on a 'block form' defect reference model, as discussed in
Sections 4.3.1 and 4.3.2.

• The investigated detection method is very sensitive to signal offsets in the
MIRN signal, as mentioned in Section 4.3.3.

• On-line disc defect identification based on the defect clustering algorithm
from Chapter 3 seemed accurate and fast enough for real-time implementation,
as discussed in Sections 4.2.3 and 4.3.

6 Detector simulation model

Here we present the most important elements of the Simulink model that is used
to simulate the behavior of the maximum likelihood defect detector.

Figure 29: Root level diagram of the ML defect detector simulation model.

Figure 30: FIR-filter of the ML defect detector simulation model.

Figure 31: Threshold comparator of the ML defect detector simulation model.

Figure 32: Mean value calculation of the ML defect detector simulation model.

Figure 33: Sample counter of the ML defect detector simulation model.

Figure 34: Peak value calculation of the ML defect detector simulation model.

Figure 35: Amplitude distribution calculation of the ML defect detector
simulation model.

Figure 36: Euclidean distance calculation of the ML defect detector simulation
model.

CLAIMS:



1. Device comprising:

means for detecting anomalies in a signal by processing a plurality of samples
of the signal and by comparing the processed samples with a threshold.

2. Device as claimed in claim 1, further comprising means for identifying a
particular anomaly among a plurality of predetermined anomalies by matching,
comparing, or the like, said signal with a plurality of reference signals
corresponding to said plurality of predetermined anomalies.

3. Method for determining anomalies in a signal, said method comprising:

processing a plurality of samples of the signal; and
comparing the processed samples with a threshold.

4. Device, system or method substantially as hereinbefore described and/or
shown in the figures included therein.